Improvements on Speech Recognition for Fast Talkers
Abstract
The accuracy of a speech recognition (SR) system depends on many factors, such as the presence of background noise, mismatches in microphone and language models, variations in speaker, accent and even speaking rates. In addition to fast speakers, even normal speakers will tend to speak faster when using a speech recognition system in order to get higher throughput. Unfortunately, state-of-the-art SR systems perform significantly worse on fast speech. In this paper, we present our efforts in making our system more robust to fast speech. We propose cepstrum length normalization, applied to the incoming testing utterances, which results in a 13% word error rate reduction on an independent evaluation corpus. Moreover, this improvement is additive to the contribution of Maximum Likelihood Linear Regression (MLLR) adaptation. Together with MLLR, a 23% error rate reduction was achieved.
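The abstract does not say how cepstrum length normalization is computed. One plausible reading, offered here only as a minimal sketch, is that the sequence of cepstral frames of a test utterance is stretched along the time axis so that fast speech looks more like speech at a normal rate. The function name `stretch_cepstra`, the `factor` parameter, and the use of linear interpolation are all assumptions for illustration, not details taken from the paper.

```python
from typing import List

Frame = List[float]  # one cepstral feature vector (hypothetical representation)


def stretch_cepstra(frames: List[Frame], factor: float) -> List[Frame]:
    """Resample cepstral frames along time by linear interpolation.

    factor > 1 lengthens the utterance (as one might for fast speech);
    factor < 1 shortens it. This is an illustrative assumption, not the
    paper's documented procedure.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not frames:
        return []

    n_in = len(frames)
    n_out = max(1, round(n_in * factor))

    # Degenerate cases: nothing to interpolate between.
    if n_in == 1 or n_out == 1:
        return [list(frames[0]) for _ in range(n_out)]

    stretched: List[Frame] = []
    for j in range(n_out):
        # Map output index j onto the input time axis so the endpoints align.
        pos = j * (n_in - 1) / (n_out - 1)
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        w = pos - lo
        stretched.append(
            [(1.0 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])]
        )
    return stretched


if __name__ == "__main__":
    # Two 2-dimensional frames stretched by 1.5x become three frames.
    print(stretch_cepstra([[0.0, 10.0], [2.0, 20.0]], 1.5))
```

In a sketch like this, the stretch factor would have to come from some estimate of the speaking rate of the incoming utterance; how the paper chooses it is not stated in the abstract.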
Similar resources
Intelligibility of clear and conversational speech of young and elderly talkers.
It has been documented that talkers can be trained to produce "clear" speech, which is significantly more intelligible for hearing-impaired listeners. In this study, the ability of both younger and older talkers to produce clear speech after a minimal amount of instruction and practice was investigated. Tape recordings were made with the talkers attempting to produce both conversational-style a...
Which Phoneme-to-Viseme Maps Best Improve Visual-Only Computer Lip-Reading?
A critical assumption of all current visual speech recognition systems is that there are visual speech units called visemes which can be mapped to units of acoustic speech, the phonemes. Despite there being a number of published maps it is infrequent to see the effectiveness of these tested, particularly on visual-only lip-reading (many works use audio-visual speech). Here we examine 120 mappin...
Investigating the Narrative Skills of Late Talkers Through Sequential Picture Stories
Objectives: The purpose of the present study is to investigate the oral narrative skills of late talkers, mostly caused by mental disorders, while they try to comprehend a wordless sequential picture story in order to create and narrate the relevant story. Methods: To this end, 15 participants (10 male and 5 female), who were students of a specialized school for physically and mentally retarded...
Effectiveness of computer-based auditory training in improving the perception of noise-vocoded speech.
Five experiments were designed to evaluate the effectiveness of "high-variability" lexical training in improving the ability of normal-hearing subjects to perceive noise-vocoded speech that had been spectrally shifted to simulate tonotopic misalignment. Two approaches to training were implemented. One training approach required subjects to recognize isolated words, while the other training appr...
Talker Background and Individual Differences in the Speech Intelligibility Benefit
One way talkers can increase intelligibility is by producing clear speech. Though clear speech, as opposed to conversational speech (ConvS), generally increases intelligibility (known as the clear speech intelligibility benefit), not all talkers exhibit the same degree of benefit. Ferguson showed that while intelligibility increased across talkers for clear speech, when looking at individual ta...
Publication date: 1999